Chest sounds recorded with a stethoscope provide an opportunity for remote cardio-respiratory health monitoring of neonates. However, reliable monitoring requires high-quality heart and lung sounds. This paper presents novel non-negative matrix factorisation (NMF) and non-negative matrix co-factorisation (NMCF) methods for neonatal chest sound separation. To assess these methods and compare them with existing single-source separation methods, an artificial mixture dataset was generated comprising heart, lung, and noise sounds, and signal-to-noise ratios were then calculated for these artificial mixtures. The methods were also tested on real-world noisy neonatal chest sounds and assessed based on vital sign estimation error and a 1-5 signal quality score developed in our previous works. Additionally, the computational cost of all methods was assessed to determine suitability for real-time processing. Overall, both the proposed NMF and NMCF methods outperformed the next best existing method by 2.7 dB to 11.6 dB on the artificial dataset and by 0.40 to 1.12 in signal quality on the real-world dataset. The median processing time for sound separation of a 10 s recording was found to be 342 ms for NMF and 28.3 s for NMCF. Given their stable and robust performance, we believe the proposed methods are useful for denoising neonatal heart and lung sounds in real-world environments. Code for the proposed and existing methods is available at: https://github.com/egrooby-monash/heart-and-lung-sound-eparation.
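As a rough illustration of the general NMF approach described above (not the paper's specific NMF/NMCF formulation, which relies on reference bases and constraints for heart, lung, and noise), the sketch below decomposes a chest-sound magnitude spectrogram and regroups components into heart and lung estimates. The STFT settings and the assignment of components to sources are assumptions made for illustration.

```python
# Minimal sketch of NMF-based source separation on a chest-sound spectrogram.
# Illustrative only; the paper's NMF/NMCF methods use task-specific bases and
# constraints that are not shown here.
import numpy as np
import librosa
from sklearn.decomposition import NMF

def separate_nmf(audio, sr=4000, n_components=20, n_heart=10):
    """Decompose a magnitude spectrogram and regroup components into 'heart'
    and 'lung' estimates (the component-to-source assignment is assumed)."""
    stft = librosa.stft(audio, n_fft=512, hop_length=128)
    mag, phase = np.abs(stft), np.angle(stft)

    model = NMF(n_components=n_components, init="nndsvda", max_iter=500)
    W = model.fit_transform(mag)          # spectral bases (freq x components)
    H = model.components_                 # activations (components x time)

    # Hypothetical split: first n_heart components -> heart, rest -> lung.
    heart_mag = W[:, :n_heart] @ H[:n_heart, :]
    lung_mag = W[:, n_heart:] @ H[n_heart:, :]

    # Wiener-style masking, reusing the mixture phase for reconstruction.
    eps = 1e-8
    total = heart_mag + lung_mag + eps
    heart = librosa.istft(mag * (heart_mag / total) * np.exp(1j * phase),
                          hop_length=128, length=len(audio))
    lung = librosa.istft(mag * (lung_mag / total) * np.exp(1j * phase),
                         hop_length=128, length=len(audio))
    return heart, lung
```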
We present Azimuth, an open-source and easy-to-use tool to perform error analysis for text classification. Compared to other stages of the ML development cycle, such as model training and hyper-parameter tuning, the process and tooling for the error analysis stage are less mature. However, this stage is critical for the development of reliable and trustworthy AI systems. To make error analysis more systematic, we propose an approach comprising dataset analysis and model quality assessment, which Azimuth facilitates. We aim to help AI practitioners discover and address areas where the model does not generalize by leveraging and integrating a range of ML techniques, such as saliency maps, similarity, uncertainty, and behavioral analyses, all in one tool. Our code and documentation are available at github.com/servicenow/azimuth.
How do international crises unfold? We conceptualize international relations as a strategic chess game between adversaries and develop a systematic approach for measuring pieces, moves, and gambits accurately and consistently throughout history. We introduce a new ontology and dataset of international events, called ICBe, based on the very high-quality narrative corpus of the International Crisis Behavior (ICB) project. We demonstrate that ICBe has higher coverage, recall, and precision than existing state-of-the-art datasets, and conduct two detailed case studies of the Cuban Missile Crisis (1962) and the Crimea-Donbas Crisis (2014). We further introduce two new event visualizations (event iconography and crisis maps), an automated benchmark for measuring event recall using natural language processing (synthetic narratives), and an ontology reconstruction task for objectively measuring event precision. We make the data, online appendix, replication material, and visualizations for every crisis available at the companion website www.crisisevents.org and a GitHub repository.
Realistic 3D indoor scene datasets have enabled significant recent progress in computer vision, scene understanding, autonomous navigation, and 3D reconstruction. However, existing datasets are limited in scale, diversity, and customizability, and scanning and annotating more scenes is time-consuming and expensive. Fortunately, combinatorics is on our side: existing 3D scene datasets contain enough individual rooms, if only there were a way to recombine them into new layouts. In this paper, we propose the task of generating novel 3D floor plans from existing 3D rooms. We identify three sub-tasks of this problem: generating 2D layouts, retrieving compatible 3D rooms, and deforming 3D rooms to fit a layout. We then discuss different strategies for solving the problem and design two representative pipelines: one uses available 2D floor plans to guide the selection and deformation of 3D rooms; the other learns to retrieve a set of compatible 3D rooms and combine them into novel layouts. We design a set of metrics that evaluate the generated results with respect to each of the three sub-tasks and show that different methods trade off performance across these sub-tasks. Finally, we survey downstream tasks that benefit from the generated 3D scenes and discuss how to select the methods best suited to the needs of those tasks.
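As a toy illustration of the "retrieve compatible 3D rooms" sub-task mentioned above (not the learned retrieval pipeline from the paper), the sketch below scores candidate rooms against a 2D layout slot by footprint area and aspect ratio; the `RoomFootprint` fields and category labels are hypothetical.

```python
# Illustrative retrieval of candidate 3D rooms for a 2D layout slot, scored by
# footprint compatibility. The actual pipelines use learned retrieval and
# room deformation, which are not modeled here.
from dataclasses import dataclass

@dataclass
class RoomFootprint:
    room_id: str
    width: float   # metres
    depth: float   # metres
    category: str  # e.g. "bedroom", "kitchen" (hypothetical labels)

def compatibility(slot: RoomFootprint, room: RoomFootprint) -> float:
    """Higher is better: penalize category mismatch, area and aspect-ratio gaps."""
    if slot.category != room.category:
        return 0.0
    area_slot, area_room = slot.width * slot.depth, room.width * room.depth
    area_score = min(area_slot, area_room) / max(area_slot, area_room)
    aspect_slot, aspect_room = slot.width / slot.depth, room.width / room.depth
    aspect_score = min(aspect_slot, aspect_room) / max(aspect_slot, aspect_room)
    return area_score * aspect_score

def retrieve(slot: RoomFootprint, candidates, k=3):
    """Return the k most compatible candidate rooms for the given layout slot."""
    return sorted(candidates, key=lambda r: compatibility(slot, r), reverse=True)[:k]
```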
Artificial neural systems trained with reinforcement, supervised, and unsupervised learning all acquire internal representations of high-dimensional input. The extent to which these representations depend on the different learning objectives is largely unknown. Here we compare the representations learned by eight different convolutional neural networks, each with an identical ResNet architecture and trained on the same set of egocentric images, but embedded within different learning systems. Specifically, the representations are trained to guide action in a compound reinforcement learning task; to predict a combination of three task-related targets with supervision; or using one of three different unsupervised objectives. Using representational similarity analysis, we find that the network trained with reinforcement learning differs most from the other networks. Through further analysis using metrics inspired by the neuroscience literature, we find that the model trained with reinforcement learning has sparse and high-dimensional representations, in which individual images are represented by very different patterns of neural activity. Further analysis suggests that these representations may arise in order to guide long-term behavior and goal-seeking in the RL agent. Our results provide insight into how the properties of neural representations are influenced by objective functions and can inform transfer learning approaches.
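For readers unfamiliar with representational similarity analysis, the minimal sketch below compares two networks by correlating their representational dissimilarity matrices (RDMs); the distance metric and the random stand-in activations are illustrative assumptions rather than the paper's exact setup.

```python
# Minimal representational similarity analysis (RSA): compare two networks by
# correlating their representational dissimilarity matrices (RDMs).
import numpy as np
from scipy.spatial.distance import pdist
from scipy.stats import spearmanr

def rdm(activations: np.ndarray) -> np.ndarray:
    """activations: (n_images, n_units) -> condensed pairwise dissimilarities."""
    return pdist(activations, metric="correlation")

def rsa_similarity(act_a: np.ndarray, act_b: np.ndarray) -> float:
    """Spearman correlation between the two RDMs (second-order similarity)."""
    rho, _ = spearmanr(rdm(act_a), rdm(act_b))
    return rho

# Example with random stand-ins for the activations of two networks on the
# same image set (in practice these come from the trained models).
acts_rl = np.random.rand(100, 512)
acts_sup = np.random.rand(100, 512)
print(rsa_similarity(acts_rl, acts_sup))
```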
Excessive drowsiness in attention-critical contexts can lead to adverse events such as car crashes. Detecting and monitoring drowsiness can help prevent these adverse events from happening. In this paper, we use the Voice dataset to extract speech from 1,828 participants and develop a deep transfer learning model based on Hidden-Unit BERT (HuBERT) speech representations to detect drowsiness in individuals. Speech is an under-utilized data source in drowsiness detection, yet it offers a promising resource because speech collection is convenient, cost-effective, and non-invasive. Two complementary techniques were conducted to seek convergent evidence about the relative importance of individual speech tasks. Our first technique, masking, evaluates tasks by combining all speech tasks, masking selected responses in the speech, and observing systematic changes in model accuracy. Our second technique, separate training, compares the accuracy of multiple models, each using the same architecture but trained on a different subset of speech tasks. Our evaluation shows that the best-performing models utilize the memory recall task and the categorical naming task from the Boston Naming Test, achieving accuracies of 80.07% (F1-score of 0.85) and 81.13% (F1-score of 0.89), respectively.
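A minimal sketch of extracting HuBERT representations for a downstream drowsiness classifier is shown below; the checkpoint name, mean-pooling over time, and the choice of downstream classifier are assumptions, not the paper's transfer-learning configuration.

```python
# Sketch: HuBERT speech embeddings as input features for a drowsiness classifier.
import torch
from transformers import AutoFeatureExtractor, HubertModel

feature_extractor = AutoFeatureExtractor.from_pretrained("facebook/hubert-base-ls960")
hubert = HubertModel.from_pretrained("facebook/hubert-base-ls960")

def embed(waveform_16k: torch.Tensor) -> torch.Tensor:
    """waveform_16k: 1-D float tensor sampled at 16 kHz -> fixed-size embedding."""
    inputs = feature_extractor(waveform_16k.numpy(), sampling_rate=16000,
                               return_tensors="pt")
    with torch.no_grad():
        hidden = hubert(**inputs).last_hidden_state  # (1, frames, 768)
    return hidden.mean(dim=1).squeeze(0)             # mean-pool over time

# The resulting 768-d embeddings can then be fed to any downstream classifier
# (e.g. logistic regression) to predict the drowsiness label.
```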
Radiomics uses quantitative medical imaging features to predict clinical outcomes. Currently, in a new clinical application, finding the optimal radiomics method among the many available options has to be done manually through a heuristic trial-and-error process. In this study, we propose a framework to automatically optimize the construction of the radiomics workflow per application. To this end, we formulate radiomics as a modular workflow and include a large collection of common algorithms for each component. To optimize the workflow per application, we use automated machine learning based on random search and ensembling. We evaluate our method in twelve different clinical applications, resulting in the following areas under the curve: 1) liposarcoma (0.83); 2) desmoid-type fibromatosis (0.82); 3) primary liver tumors (0.80); 4) gastrointestinal tumors (0.77); 5) colorectal liver metastases (0.61); 6) melanoma metastases (0.45); 7) hepatocellular carcinoma (0.75); 8) mesenteric fibrosis (0.80); 9) prostate cancer (0.72); 10) glioma (0.71); 11) Alzheimer's disease (0.87); and 12) head and neck cancer (0.84). We show that our framework performs competitively with human experts, outperforms a radiomics baseline, and performs similarly to or better than Bayesian optimization and more advanced ensemble approaches. Finally, our method fully automatically optimizes the construction of the radiomics workflow, thereby streamlining the search for radiomics biomarkers in new applications. To facilitate reproducibility and future research, we publicly release six datasets, the software implementation of our framework, and the code to reproduce this study.
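As a hedged, toy illustration of per-application workflow optimization via random search over modular components (far simpler than the actual framework's search space and ensembling), consider the following sketch; the component choices and hyperparameters are assumptions made for illustration.

```python
# Toy random search over a modular classification workflow, in the spirit of
# automatically constructing a radiomics pipeline per application.
import random
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler, RobustScaler
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import RandomForestClassifier
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score

SEARCH_SPACE = {
    "scaler": [StandardScaler(), RobustScaler()],
    "n_features": [5, 10, 20],   # must not exceed the number of input features
    "classifier": [LogisticRegression(max_iter=1000),
                   RandomForestClassifier(n_estimators=200),
                   SVC(probability=True)],
}

def random_search(X, y, n_iter=25, seed=0):
    """Sample workflow configurations and keep the one with the best CV AUC."""
    rng = random.Random(seed)
    best_auc, best_pipe = -1.0, None
    for _ in range(n_iter):
        pipe = Pipeline([
            ("scale", rng.choice(SEARCH_SPACE["scaler"])),
            ("select", SelectKBest(f_classif, k=rng.choice(SEARCH_SPACE["n_features"]))),
            ("clf", rng.choice(SEARCH_SPACE["classifier"])),
        ])
        auc = cross_val_score(pipe, X, y, cv=5, scoring="roc_auc").mean()
        if auc > best_auc:
            best_auc, best_pipe = auc, pipe
    return best_pipe, best_auc
```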
The Common Voice corpus is a massively-multilingual collection of transcribed speech intended for speech technology research and development. Common Voice is designed for Automatic Speech Recognition purposes but can be useful in other domains (e.g. language identification). To achieve scale and sustainability, the Common Voice project employs crowdsourcing for both data collection and data validation. The most recent release includes 29 languages, and as of November 2019 there are a total of 38 languages collecting data. Over 50,000 individuals have participated so far, resulting in 2,500 hours of collected audio. To our knowledge this is the largest audio corpus in the public domain for speech recognition, both in terms of number of hours and number of languages. As an example use case for Common Voice, we present speech recognition experiments using Mozilla's DeepSpeech Speech-to-Text toolkit. By applying transfer learning from a source English model, we find an average Character Error Rate improvement of 5.99 ± 5.48 for twelve target languages (German, French, Italian, Turkish, Catalan, Slovenian, Welsh, Irish, Breton, Tatar, Chuvash, and Kabyle). For most of these languages, these are the first ever published results on end-to-end Automatic Speech Recognition.
Benefiting from its intrinsic ability to exploit supervision information, contrastive learning has recently achieved promising performance in the field of deep graph clustering. However, we observe that two drawbacks of the positive and negative sample construction mechanisms limit the performance of existing algorithms. 1) The quality of positive samples heavily depends on carefully designed data augmentations, while inappropriate data augmentations easily lead to semantic drift and indiscriminative positive samples. 2) The constructed negative samples are unreliable because they ignore important clustering information. To solve these problems, we propose a Cluster-guided Contrastive deep Graph Clustering network (CCGC) that mines the intrinsic supervision information in high-confidence clustering results. Specifically, instead of conducting complex node or edge perturbation, we construct two views of the graph by designing special Siamese encoders whose weights are not shared between the sibling sub-networks. Then, guided by the high-confidence clustering information, we carefully select and construct positive samples from the same high-confidence cluster in the two views. Moreover, to construct semantically meaningful negative sample pairs, we regard the centers of different high-confidence clusters as negative samples, thus improving the discriminative capability and reliability of the constructed sample pairs. Lastly, we design an objective function that pulls together samples from the same cluster while pushing away those from other clusters, by maximizing and minimizing the cross-view cosine similarity between positive and negative samples, respectively. Extensive experimental results on six datasets demonstrate the effectiveness of CCGC compared with existing state-of-the-art algorithms.
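A simplified sketch of such a cluster-guided contrastive objective is given below: cross-view embeddings of the same high-confidence node act as positives, and the centers of the other high-confidence clusters act as negatives. The exact encoders, confidence filtering, and loss weighting of CCGC are not reproduced; this is an illustrative assumption-laden toy.

```python
# Simplified cluster-guided contrastive objective (toy version, not CCGC itself).
import torch
import torch.nn.functional as F

def cluster_contrastive_loss(z1, z2, labels):
    """z1, z2: (n, d) embeddings of the same high-confidence nodes from the two
    views; labels: (n,) cluster assignments, assumed to take values 0..k-1."""
    z1, z2 = F.normalize(z1, dim=1), F.normalize(z2, dim=1)

    # Positive term: cross-view cosine similarity of each node with itself.
    pos = (z1 * z2).sum(dim=1).mean()

    # Negative term: similarity of each node to the centers of the other clusters.
    centers = torch.stack([z2[labels == c].mean(dim=0)
                           for c in labels.unique()])          # (k, d)
    centers = F.normalize(centers, dim=1)
    sims = z1 @ centers.T                                      # (n, k)
    own_cluster = F.one_hot(labels, num_classes=centers.size(0)).bool()
    neg = sims[~own_cluster].mean()

    # Minimizing this pulls positives together and pushes negatives apart.
    return neg - pos
```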
As one of the prevalent methods for building automation systems, Imitation Learning (IL) presents promising performance in a wide range of domains. However, despite considerable improvements in policy performance, research on the explainability of IL models is still limited. Inspired by recent approaches in explainable artificial intelligence, we propose a model-agnostic explanation framework for IL models called R2RISE. R2RISE aims to explain overall policy performance with respect to the frames in the demonstrations. It iteratively retrains the black-box IL model on randomized masked demonstrations and uses environment returns, the conventional evaluation outcome, as coefficients to build an importance map. We also conducted experiments to investigate three major questions concerning the equality of frames' importance, the effectiveness of the importance map, and the connections between importance maps from different IL models. The results show that R2RISE successfully distinguishes important frames from the demonstrations.
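A conceptual sketch of the RISE-style loop described above is given below: the IL model is repeatedly retrained on randomly masked demonstrations, and each mask is weighted by the resulting environment return. `train_il` and `evaluate_return` are hypothetical stand-ins for the IL training and rollout procedures used in the paper.

```python
# Conceptual sketch of building a frame-importance map by retraining an IL
# model on randomly masked demonstrations and crediting kept frames with the
# resulting environment return.
import numpy as np

def importance_map(demonstrations, train_il, evaluate_return,
                   n_iters=100, keep_prob=0.5, seed=0):
    """demonstrations: numpy array of frames (n_frames, ...). Returns a
    per-frame importance score aggregated over the randomized masks."""
    rng = np.random.default_rng(seed)
    n = len(demonstrations)
    importance = np.zeros(n)
    mask_counts = np.zeros(n)

    for _ in range(n_iters):
        mask = rng.random(n) < keep_prob           # which frames are kept
        policy = train_il(demonstrations[mask])    # retrain on masked demos
        ret = evaluate_return(policy)              # environment return
        importance += mask * ret                   # credit the kept frames
        mask_counts += mask

    return importance / np.maximum(mask_counts, 1)
```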